Dataset statistics
| Statistic | Value |
|---|---|
| Number of variables | 16 |
| Number of observations | 3488021 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 379.2 MiB |
| Average record size in memory | 114.0 B |
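The average record size follows from the other two figures in the table: total memory divided by the number of observations. A minimal sketch of that arithmetic, assuming MiB means 1024² bytes (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the "Dataset statistics" table.
MIB = 1024 ** 2                    # assumption: 1 MiB = 1,048,576 bytes
total_bytes = 379.2 * MIB          # "Total size in memory"
n_observations = 3_488_021         # "Number of observations"

# Average bytes per row.
avg_record_bytes = total_bytes / n_observations
print(f"{avg_record_bytes:.1f} B")  # prints "114.0 B"
```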
Variable types
| Type | Count |
|---|---|
| Categorical | 7 |
| DateTime | 2 |
| Numeric | 7 |
Alerts
| Alert | Type |
|---|---|
| ride_id has a high cardinality: 3488021 distinct values | High cardinality |
| start_station_name has a high cardinality: 1583 distinct values | High cardinality |
| start_station_id has a high cardinality: 1576 distinct values | High cardinality |
| end_station_name has a high cardinality: 1623 distinct values | High cardinality |
| end_station_id has a high cardinality: 1616 distinct values | High cardinality |
| start_lat is highly correlated with start_lng and 2 other fields | High correlation |
| start_lng is highly correlated with start_lat and 2 other fields | High correlation |
| end_lat is highly correlated with start_lat and 2 other fields | High correlation |
| end_lng is highly correlated with start_lat and 2 other fields | High correlation |
| start_hour is highly correlated with end_hour | High correlation |
| end_hour is highly correlated with start_hour | High correlation |
| elapsed_min is highly skewed (γ1 = 300.4039448) | Skewed |
| ride_id is uniformly distributed | Uniform |
| ride_id has unique values | Unique |
| start_hour has 61065 (1.8%) zeros | Zeros |
| end_hour has 67813 (1.9%) zeros | Zeros |
| elapsed_min has 98098 (2.8%) zeros | Zeros |
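The skewness alert for elapsed_min reports γ1, a coefficient of skewness. As a rough illustration of why a few very long rides push this value up, the sketch below computes the population moment coefficient m3 / m2^1.5 on made-up durations. Both the sample data and the choice of estimator are assumptions; the profiling tool may apply a bias correction, so its numbers need not match this formula exactly.

```python
from statistics import fmean

def moment_skewness(values):
    """Population moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical ride durations in minutes: mostly short, one day-long outlier.
durations = [0, 4, 6, 7, 9, 11, 12, 15, 18, 1440]
skew = moment_skewness(durations)
print(f"skewness = {skew:.2f}")
```

A single extreme value is enough to make the coefficient strongly positive, which is the pattern the alert flags.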
Reproduction
| Item | Value |
|---|---|
| Analysis started | 2022-10-26 19:05:16.495017 |
| Analysis finished | 2022-10-26 19:17:22.295625 |
| Duration | 12 minutes and 5.8 seconds |
| Software version | pandas-profiling v3.4.0 |
| Download configuration | config.json |
ride_id
Categorical
| Statistic | Value |
|---|---|
| Distinct | 3488021 |
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 26.6 MiB |

| Value | Count |
|---|---|
| C09E4093905089BD | 1 |
| 04AE80D0289D063D | 1 |
| A1CD31FF5782B541 | 1 |
| B16345B4B74C0BE2 | 1 |
| 9CB0536433FEACA3 | 1 |
| Other values (3488016) | 3488016 |
Length
| Statistic | Value |
|---|---|
| Max length | 16 |
| Median length | 16 |
| Mean length | 16 |
| Min length | 16 |
Characters and Unicode
| Statistic | Value |
|---|---|
| Total characters | 55808336 |
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
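The category names used in the tables of this report (Decimal Number, Uppercase Letter, Space Separator, Other Punctuation, and so on) correspond to Unicode general categories. A small sketch using Python's standard `unicodedata` module shows how such per-category counts can be obtained. The helper name `category_counts` is illustrative, and this is not necessarily how the profiling tool computes them.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category code (e.g. 'Lu', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)

# A ride_id sample value: hexadecimal digits and uppercase letters only.
ride_id_counts = category_counts("C09E4093905089BD")
# A station name: letters, spaces, and an ampersand.
station_counts = category_counts("West St & Chambers St")

print(dict(ride_id_counts))   # Nd = Decimal Number, Lu = Uppercase Letter
print(dict(station_counts))   # Ll = Lowercase Letter, Zs = Space Separator, Po = Other Punctuation
```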
Unique
| Statistic | Value |
|---|---|
| Unique | 3488021 |
| Unique (%) | 100.0% |
Sample
| 1st row | C09E4093905089BD |
|---|---|
| 2nd row | 374630DB5822C392 |
| 3rd row | 4F73CA25880A1215 |
| 4th row | ECD6EE19C0CC1D31 |
| 5th row | 44D0987673B9997D |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| C09E4093905089BD | 1 | < 0.1% |
| 04AE80D0289D063D | 1 | < 0.1% |
| A1CD31FF5782B541 | 1 | < 0.1% |
| B16345B4B74C0BE2 | 1 | < 0.1% |
| 9CB0536433FEACA3 | 1 | < 0.1% |
| F5F294CDBED71D80 | 1 | < 0.1% |
| 11316B07F77DDC80 | 1 | < 0.1% |
| 4964DAD0F44C034D | 1 | < 0.1% |
| 06AD757F7966A114 | 1 | < 0.1% |
| 1ADAA8874400E419 | 1 | < 0.1% |
| Other values (3488011) | 3488011 | |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| c09e4093905089bd | 1 | < 0.1% |
| f92a7c3a29e3236e | 1 | < 0.1% |
| 0785828363ee9948 | 1 | < 0.1% |
| 7de97aa765e2dacb | 1 | < 0.1% |
| 4f73ca25880a1215 | 1 | < 0.1% |
| ecd6ee19c0cc1d31 | 1 | < 0.1% |
| 44d0987673b9997d | 1 | < 0.1% |
| a80f03b56110aff8 | 1 | < 0.1% |
| d967c4fdf71ade61 | 1 | < 0.1% |
| 62da916392de2a17 | 1 | < 0.1% |
| Other values (3488011) | 3488011 |
Most occurring characters
| Value | Count | Frequency (%) |
| E | 3490690 | 6.3% |
| 4 | 3490070 | 6.3% |
| A | 3489063 | 6.3% |
| 2 | 3488979 | 6.3% |
| C | 3488571 | 6.3% |
| D | 3488555 | 6.3% |
| B | 3488357 | 6.3% |
| 7 | 3488290 | 6.3% |
| 0 | 3488146 | 6.3% |
| 5 | 3487967 | 6.2% |
| Other values (6) | 20919648 |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 34875951 | 62.5% |
| Uppercase Letter | 20932385 | 37.5% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 3490070 | |
| 2 | 3488979 | |
| 7 | 3488290 | |
| 0 | 3488146 | |
| 5 | 3487967 | |
| 1 | 3487372 | |
| 8 | 3487254 | |
| 3 | 3486776 | |
| 9 | 3486594 | |
| 6 | 3484503 |
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 3490690 | |
| A | 3489063 | |
| C | 3488571 | |
| D | 3488555 | |
| B | 3488357 | |
| F | 3487149 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 34875951 | |
| Latin | 20932385 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 4 | 3490070 | |
| 2 | 3488979 | |
| 7 | 3488290 | |
| 0 | 3488146 | |
| 5 | 3487967 | |
| 1 | 3487372 | |
| 8 | 3487254 | |
| 3 | 3486776 | |
| 9 | 3486594 | |
| 6 | 3484503 |
Latin
| Value | Count | Frequency (%) |
| E | 3490690 | |
| A | 3489063 | |
| C | 3488571 | |
| D | 3488555 | |
| B | 3488357 | |
| F | 3487149 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 55808336 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| E | 3490690 | 6.3% |
| 4 | 3490070 | 6.3% |
| A | 3489063 | 6.3% |
| 2 | 3488979 | 6.3% |
| C | 3488571 | 6.3% |
| D | 3488555 | 6.3% |
| B | 3488357 | 6.3% |
| 7 | 3488290 | 6.3% |
| 0 | 3488146 | 6.3% |
| 5 | 3487967 | 6.2% |
| Other values (6) | 20919648 |
rideable_type
Categorical
| Statistic | Value |
|---|---|
| Distinct | 3 |
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.3 MiB |

| Value | Count |
|---|---|
| classic_bike | 2623429 |
| electric_bike | 826904 |
| docked_bike | 37688 |
Length
| Statistic | Value |
|---|---|
| Max length | 13 |
| Median length | 12 |
| Mean length | 12.22626469 |
| Min length | 11 |
Characters and Unicode
| Statistic | Value |
|---|---|
| Total characters | 42645468 |
| Distinct characters | 13 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Statistic | Value |
|---|---|
| Unique | 0 |
| Unique (%) | 0.0% |
Sample
| 1st row | classic_bike |
|---|---|
| 2nd row | electric_bike |
| 3rd row | electric_bike |
| 4th row | electric_bike |
| 5th row | classic_bike |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| classic_bike | 2623429 | 75.2% |
| electric_bike | 826904 | 23.7% |
| docked_bike | 37688 | 1.1% |
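Each frequency percentage in these tables is the value's count divided by the total number of observations. The sketch below recomputes the shares for rideable_type from the counts listed above; rounding to one decimal place is an assumption about how the report displays them.

```python
# Counts copied from the rideable_type "Common Values" table.
counts = {
    "classic_bike": 2_623_429,
    "electric_bike": 826_904,
    "docked_bike": 37_688,
}

total = sum(counts.values())  # equals the number of observations, 3488021

# Share of each bike type as a percentage, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```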
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
|---|---|---|
| classic_bike | 2623429 | 75.2% |
| electric_bike | 826904 | 23.7% |
| docked_bike | 37688 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| c | 6938354 | 16.3% |
| i | 6938354 | 16.3% |
| s | 5246858 | 12.3% |
| e | 5179517 | 12.1% |
| k | 3525709 | 8.3% |
| _ | 3488021 | 8.2% |
| b | 3488021 | 8.2% |
| l | 3450333 | 8.1% |
| a | 2623429 | 6.2% |
| t | 826904 | 1.9% |
| Other values (3) | 939968 | 2.2% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Lowercase Letter | 39157447 | 91.8% |
| Connector Punctuation | 3488021 | 8.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| c | 6938354 | 17.7% |
| i | 6938354 | 17.7% |
| s | 5246858 | 13.4% |
| e | 5179517 | 13.2% |
| k | 3525709 | 9.0% |
| b | 3488021 | 8.9% |
| l | 3450333 | 8.8% |
| a | 2623429 | 6.7% |
| t | 826904 | 2.1% |
| r | 826904 | 2.1% |
| Other values (2) | 113064 | 0.3% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 3488021 |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 39157447 | 91.8% |
| Common | 3488021 | 8.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| c | 6938354 | 17.7% |
| i | 6938354 | 17.7% |
| s | 5246858 | 13.4% |
| e | 5179517 | 13.2% |
| k | 3525709 | 9.0% |
| b | 3488021 | 8.9% |
| l | 3450333 | 8.8% |
| a | 2623429 | 6.7% |
| t | 826904 | 2.1% |
| r | 826904 | 2.1% |
| Other values (2) | 113064 | 0.3% |
Common
| Value | Count | Frequency (%) |
| _ | 3488021 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 42645468 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| c | 6938354 | 16.3% |
| i | 6938354 | 16.3% |
| s | 5246858 | 12.3% |
| e | 5179517 | 12.1% |
| k | 3525709 | 8.3% |
| _ | 3488021 | 8.2% |
| b | 3488021 | 8.2% |
| l | 3450333 | 8.1% |
| a | 2623429 | 6.2% |
| t | 826904 | 1.9% |
| Other values (3) | 939968 | 2.2% |
started_at
Date
| Distinct | 1671390 |
|---|---|
| Distinct (%) | 47.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 26.6 MiB |
| Minimum | 2022-07-01 00:00:01 |
|---|---|
| Maximum | 2022-07-31 23:59:52 |
Histogram with fixed size bins (bins=50)
ended_at
Date
| Distinct | 1673829 |
|---|---|
| Distinct (%) | 48.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 26.6 MiB |
| Minimum | 2022-07-01 00:01:32 |
|---|---|
| Maximum | 2022-08-03 22:17:53 |
Histogram with fixed size bins (bins=50)
start_station_name
Categorical
| Distinct | 1583 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 26.6 MiB |
| West St & Chambers St | 15865 |
|---|---|
| W 21 St & 6 Ave | 13493 |
| Broadway & W 58 St | 12765 |
| Broadway & E 14 St | 12645 |
| 6 Ave & W 33 St | 12605 |
| Other values (1578) |
Length
| Max length | 45 |
|---|---|
| Median length | 37 |
| Mean length | 20.05504669 |
| Min length | 9 |
Characters and Unicode
| Total characters | 69952424 |
|---|---|
| Distinct characters | 69 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Melrose St & Broadway |
|---|---|
| 2nd row | E 68 St & 3 Ave |
| 3rd row | W 37 St & 10 Ave |
| 4th row | W 37 St & 10 Ave |
| 5th row | E 68 St & 3 Ave |
Common Values
| Value | Count | Frequency (%) |
| West St & Chambers St | 15865 | 0.5% |
| W 21 St & 6 Ave | 13493 | 0.4% |
| Broadway & W 58 St | 12765 | 0.4% |
| Broadway & E 14 St | 12645 | 0.4% |
| 6 Ave & W 33 St | 12605 | 0.4% |
| 12 Ave & W 40 St | 12339 | 0.4% |
| West St & Liberty St | 11961 | 0.3% |
| Broadway & W 25 St | 11774 | 0.3% |
| 1 Ave & E 68 St | 11312 | 0.3% |
| E 33 St & 1 Ave | 11166 | 0.3% |
| Other values (1573) | 3362096 |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| st | 3719173 | |
| 3417269 | ||
| ave | 2248755 | 11.6% |
| w | 1013545 | 5.2% |
| e | 896906 | 4.6% |
| broadway | 235860 | 1.2% |
| park | 205452 | 1.1% |
| pl | 186167 | 1.0% |
| 6 | 177520 | 0.9% |
| 1 | 171903 | 0.9% |
| Other values (817) | 7187863 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 15979954 | |
| t | 5285064 | 7.6% |
| e | 4910917 | 7.0% |
| S | 4079204 | 5.8% |
| & | 3431008 | 4.9% |
| v | 2652700 | 3.8% |
| A | 2533500 | 3.6% |
| r | 2525197 | 3.6% |
| a | 2487933 | 3.6% |
| n | 2138419 | 3.1% |
| Other values (59) | 23928528 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 31892520 | |
| Space Separator | 15979954 | |
| Uppercase Letter | 12776313 | |
| Decimal Number | 5789488 | 8.3% |
| Other Punctuation | 3471456 | 5.0% |
| Open Punctuation | 14379 | < 0.1% |
| Close Punctuation | 14379 | < 0.1% |
| Dash Punctuation | 13465 | < 0.1% |
| Control | 470 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 5285064 | |
| e | 4910917 | |
| v | 2652700 | |
| r | 2525197 | |
| a | 2487933 | |
| n | 2138419 | 6.7% |
| o | 1912414 | 6.0% |
| l | 1400313 | 4.4% |
| s | 1214941 | 3.8% |
| i | 1214542 | 3.8% |
| Other values (16) | 6150080 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 4079204 | |
| A | 2533500 | |
| W | 1379583 | 10.8% |
| E | 983213 | 7.7% |
| B | 620559 | 4.9% |
| P | 611115 | 4.8% |
| C | 503276 | 3.9% |
| M | 329771 | 2.6% |
| G | 238120 | 1.9% |
| L | 212428 | 1.7% |
| Other values (14) | 1285544 | 10.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1312380 | |
| 2 | 707925 | |
| 3 | 642804 | |
| 4 | 567095 | |
| 5 | 539356 | |
| 6 | 504794 | 8.7% |
| 8 | 451669 | 7.8% |
| 7 | 407833 | 7.0% |
| 0 | 383200 | 6.6% |
| 9 | 272432 | 4.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| & | 3431008 | |
| \ | 27204 | 0.8% |
| . | 9935 | 0.3% |
| ' | 3309 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 15979954 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 14379 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 14379 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 13465 |
Control
| Value | Count | Frequency (%) |
| (control character) | 470 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 44668833 | |
| Common | 25283591 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 5285064 | 11.8% |
| e | 4910917 | 11.0% |
| S | 4079204 | 9.1% |
| v | 2652700 | 5.9% |
| A | 2533500 | 5.7% |
| r | 2525197 | 5.7% |
| a | 2487933 | 5.6% |
| n | 2138419 | 4.8% |
| o | 1912414 | 4.3% |
| l | 1400313 | 3.1% |
| Other values (40) | 14743172 |
Common
| Value | Count | Frequency (%) |
| (space) | 15979954 | |
| & | 3431008 | 13.6% |
| 1 | 1312380 | 5.2% |
| 2 | 707925 | 2.8% |
| 3 | 642804 | 2.5% |
| 4 | 567095 | 2.2% |
| 5 | 539356 | 2.1% |
| 6 | 504794 | 2.0% |
| 8 | 451669 | 1.8% |
| 7 | 407833 | 1.6% |
| Other values (9) | 738773 | 2.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 69952424 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 15979954 | |
| t | 5285064 | 7.6% |
| e | 4910917 | 7.0% |
| S | 4079204 | 5.8% |
| & | 3431008 | 4.9% |
| v | 2652700 | 3.8% |
| A | 2533500 | 3.6% |
| r | 2525197 | 3.6% |
| a | 2487933 | 3.6% |
| n | 2138419 | 3.1% |
| Other values (59) | 23928528 |
start_station_id
Categorical
| Distinct | 1576 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 26.6 MiB |
| 5329.03 | 15865 |
|---|---|
| 6140.05 | 13493 |
| 6948.10 | 12765 |
| 5905.12 | 12645 |
| 6364.07 | 12605 |
| Other values (1571) |
Length
| Max length | 9 |
|---|---|
| Median length | 7 |
| Mean length | 6.999991973 |
| Min length | 6 |
Characters and Unicode
| Total characters | 24416119 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 4801.04 |
|---|---|
| 2nd row | 6896.16 |
| 3rd row | 6611.02 |
| 4th row | 6611.02 |
| 5th row | 6896.16 |
Common Values
| Value | Count | Frequency (%) |
| 5329.03 | 15865 | 0.5% |
| 6140.05 | 13493 | 0.4% |
| 6948.10 | 12765 | 0.4% |
| 5905.12 | 12645 | 0.4% |
| 6364.07 | 12605 | 0.4% |
| 6765.01 | 12339 | 0.4% |
| 5184.08 | 11961 | 0.3% |
| 6173.08 | 11774 | 0.3% |
| 6822.09 | 11312 | 0.3% |
| 6197.08 | 11166 | 0.3% |
| Other values (1566) | 3362096 |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| 5329.03 | 15865 | 0.5% |
| 6140.05 | 13493 | 0.4% |
| 6948.10 | 12765 | 0.4% |
| 5905.12 | 12645 | 0.4% |
| 6364.07 | 12605 | 0.4% |
| 6765.01 | 12339 | 0.4% |
| 5184.08 | 11961 | 0.3% |
| 6173.08 | 11774 | 0.3% |
| 6822.09 | 11312 | 0.3% |
| 6197.08 | 11166 | 0.3% |
| Other values (1568) | 3362100 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 4041603 | |
| . | 3487987 | |
| 5 | 2541707 | |
| 6 | 2519288 | |
| 1 | 2213222 | |
| 4 | 1964112 | |
| 7 | 1882604 | |
| 2 | 1585712 | 6.5% |
| 3 | 1479184 | 6.1% |
| 8 | 1410884 | 5.8% |
| Other values (10) | 1289816 | 5.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 20928018 | |
| Other Punctuation | 3487987 | 14.3% |
| Uppercase Letter | 104 | < 0.1% |
| Space Separator | 4 | < 0.1% |
| Lowercase Letter | 4 | < 0.1% |
| Dash Punctuation | 2 | < 0.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 4041603 | |
| 5 | 2541707 | |
| 6 | 2519288 | |
| 1 | 2213222 | |
| 4 | 1964112 | |
| 7 | 1882604 | |
| 2 | 1585712 | 7.6% |
| 3 | 1479184 | 7.1% |
| 8 | 1410884 | 6.7% |
| 9 | 1289702 | 6.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 64 | |
| Y | 34 | |
| L | 2 | 1.9% |
| N | 2 | 1.9% |
| C | 2 | 1.9% |
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 2 | |
| b | 2 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3487987 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 4 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 24416011 | |
| Latin | 108 | < 0.1% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 4041603 | |
| . | 3487987 | |
| 5 | 2541707 | |
| 6 | 2519288 | |
| 1 | 2213222 | |
| 4 | 1964112 | |
| 7 | 1882604 | |
| 2 | 1585712 | 6.5% |
| 3 | 1479184 | 6.1% |
| 8 | 1410884 | 5.8% |
| Other values (3) | 1289708 | 5.3% |
Latin
| Value | Count | Frequency (%) |
| S | 64 | |
| Y | 34 | |
| L | 2 | 1.9% |
| a | 2 | 1.9% |
| b | 2 | 1.9% |
| N | 2 | 1.9% |
| C | 2 | 1.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 24416119 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 4041603 | |
| . | 3487987 | |
| 5 | 2541707 | |
| 6 | 2519288 | |
| 1 | 2213222 | |
| 4 | 1964112 | |
| 7 | 1882604 | |
| 2 | 1585712 | 6.5% |
| 3 | 1479184 | 6.1% |
| 8 | 1410884 | 5.8% |
| Other values (10) | 1289816 | 5.3% |
end_station_name
Categorical
| Distinct | 1623 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 26.6 MiB |
| West St & Chambers St | 15955 |
|---|---|
| W 21 St & 6 Ave | 13548 |
| Broadway & E 14 St | 12798 |
| 12 Ave & W 40 St | 12732 |
| Broadway & W 58 St | 12149 |
| Other values (1618) |
Length
| Max length | 45 |
|---|---|
| Median length | 40 |
| Mean length | 20.06063782 |
| Min length | 8 |
Characters and Unicode
| Total characters | 69971926 |
|---|---|
| Distinct characters | 70 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 15 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Myrtle Ave & Grove St |
|---|---|
| 2nd row | E 85 St & York Ave |
| 3rd row | Knickerbocker Ave & Cooper St |
| 4th row | 6 Ave & Broome St |
| 5th row | E 66 St & Madison Ave |
Common Values
| Value | Count | Frequency (%) |
| West St & Chambers St | 15955 | 0.5% |
| W 21 St & 6 Ave | 13548 | 0.4% |
| Broadway & E 14 St | 12798 | 0.4% |
| 12 Ave & W 40 St | 12732 | 0.4% |
| Broadway & W 58 St | 12149 | 0.3% |
| 6 Ave & W 33 St | 12109 | 0.3% |
| West St & Liberty St | 11997 | 0.3% |
| Broadway & W 25 St | 11803 | 0.3% |
| 1 Ave & E 68 St | 11320 | 0.3% |
| 10 Ave & W 14 St | 11274 | 0.3% |
| Other values (1613) | 3362336 |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| st | 3723134 | |
| 3416295 | ||
| ave | 2243227 | 11.5% |
| w | 1003642 | 5.2% |
| e | 894450 | 4.6% |
| broadway | 233565 | 1.2% |
| park | 205400 | 1.1% |
| pl | 187321 | 1.0% |
| 6 | 176739 | 0.9% |
| 1 | 172671 | 0.9% |
| Other values (856) | 7193396 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 15969437 | |
| t | 5296037 | 7.6% |
| e | 4917230 | 7.0% |
| S | 4085017 | 5.8% |
| & | 3429985 | 4.9% |
| v | 2648440 | 3.8% |
| r | 2532947 | 3.6% |
| A | 2527857 | 3.6% |
| a | 2488951 | 3.6% |
| n | 2145475 | 3.1% |
| Other values (60) | 23930550 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 31946617 | |
| Space Separator | 15969437 | |
| Uppercase Letter | 12779354 | |
| Decimal Number | 5764381 | 8.2% |
| Other Punctuation | 3470523 | 5.0% |
| Open Punctuation | 13791 | < 0.1% |
| Close Punctuation | 13791 | < 0.1% |
| Dash Punctuation | 13586 | < 0.1% |
| Control | 446 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 5296037 | |
| e | 4917230 | |
| v | 2648440 | |
| r | 2532947 | |
| a | 2488951 | |
| n | 2145475 | 6.7% |
| o | 1916320 | 6.0% |
| l | 1408503 | 4.4% |
| i | 1219050 | 3.8% |
| s | 1215386 | 3.8% |
| Other values (16) | 6158278 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 4085017 | |
| A | 2527857 | |
| W | 1371090 | 10.7% |
| E | 979514 | 7.7% |
| B | 621030 | 4.9% |
| P | 613934 | 4.8% |
| C | 504748 | 3.9% |
| M | 329569 | 2.6% |
| G | 238832 | 1.9% |
| L | 213395 | 1.7% |
| Other values (15) | 1294368 | 10.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1312924 | |
| 2 | 706773 | |
| 3 | 637460 | |
| 4 | 565198 | |
| 5 | 535168 | |
| 6 | 500442 | 8.7% |
| 8 | 448096 | 7.8% |
| 7 | 404593 | 7.0% |
| 0 | 381858 | 6.6% |
| 9 | 271869 | 4.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| & | 3429985 | |
| \ | 27276 | 0.8% |
| . | 9868 | 0.3% |
| ' | 3394 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 15969437 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 13791 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 13791 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 13586 |
Control
| Value | Count | Frequency (%) |
| (control character) | 446 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 44725971 | |
| Common | 25245955 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 5296037 | 11.8% |
| e | 4917230 | 11.0% |
| S | 4085017 | 9.1% |
| v | 2648440 | 5.9% |
| r | 2532947 | 5.7% |
| A | 2527857 | 5.7% |
| a | 2488951 | 5.6% |
| n | 2145475 | 4.8% |
| o | 1916320 | 4.3% |
| l | 1408503 | 3.1% |
| Other values (41) | 14759194 |
Common
| Value | Count | Frequency (%) |
| (space) | 15969437 | |
| & | 3429985 | 13.6% |
| 1 | 1312924 | 5.2% |
| 2 | 706773 | 2.8% |
| 3 | 637460 | 2.5% |
| 4 | 565198 | 2.2% |
| 5 | 535168 | 2.1% |
| 6 | 500442 | 2.0% |
| 8 | 448096 | 1.8% |
| 7 | 404593 | 1.6% |
| Other values (9) | 735879 | 2.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 69971926 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 15969437 | |
| t | 5296037 | 7.6% |
| e | 4917230 | 7.0% |
| S | 4085017 | 5.8% |
| & | 3429985 | 4.9% |
| v | 2648440 | 3.8% |
| r | 2532947 | 3.6% |
| A | 2527857 | 3.6% |
| a | 2488951 | 3.6% |
| n | 2145475 | 3.1% |
| Other values (60) | 23930550 |
end_station_id
Categorical
| Distinct | 1616 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 26.6 MiB |
| 5329.03 | 15955 |
|---|---|
| 6140.05 | 13548 |
| 5905.12 | 12798 |
| 6765.01 | 12732 |
| 6948.10 | 12149 |
| Other values (1611) |
Length
| Max length | 9 |
|---|---|
| Median length | 7 |
| Mean length | 6.999870987 |
| Min length | 5 |
Characters and Unicode
| Total characters | 24415697 |
|---|---|
| Distinct characters | 23 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 15 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 4816.05 |
|---|---|
| 2nd row | 7146.04 |
| 3rd row | 4582.05 |
| 4th row | 5610.09 |
| 5th row | 6969.08 |
Common Values
| Value | Count | Frequency (%) |
| 5329.03 | 15955 | 0.5% |
| 6140.05 | 13548 | 0.4% |
| 5905.12 | 12798 | 0.4% |
| 6765.01 | 12732 | 0.4% |
| 6948.10 | 12149 | 0.3% |
| 6364.07 | 12109 | 0.3% |
| 5184.08 | 11997 | 0.3% |
| 6173.08 | 11803 | 0.3% |
| 6822.09 | 11320 | 0.3% |
| 6157.04 | 11274 | 0.3% |
| Other values (1606) | 3362336 |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| 5329.03 | 15955 | 0.5% |
| 6140.05 | 13548 | 0.4% |
| 5905.12 | 12798 | 0.4% |
| 6765.01 | 12732 | 0.4% |
| 6948.10 | 12149 | 0.3% |
| 6364.07 | 12109 | 0.3% |
| 5184.08 | 11997 | 0.3% |
| 6173.08 | 11803 | 0.3% |
| 6822.09 | 11320 | 0.3% |
| 6157.04 | 11274 | 0.3% |
| Other values (1608) | 3362340 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 4042523 | |
| . | 3487681 | |
| 5 | 2551383 | |
| 6 | 2509236 | |
| 1 | 2214454 | |
| 4 | 1964647 | |
| 7 | 1881208 | |
| 2 | 1585997 | 6.5% |
| 3 | 1481140 | 6.1% |
| 8 | 1407296 | 5.8% |
| Other values (13) | 1290132 | 5.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 20927100 | |
| Other Punctuation | 3487681 | 14.3% |
| Uppercase Letter | 906 | < 0.1% |
| Space Separator | 4 | < 0.1% |
| Lowercase Letter | 4 | < 0.1% |
| Dash Punctuation | 2 | < 0.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 4042523 | |
| 5 | 2551383 | |
| 6 | 2509236 | |
| 1 | 2214454 | |
| 4 | 1964647 | |
| 7 | 1881208 | |
| 2 | 1585997 | 7.6% |
| 3 | 1481140 | 7.1% |
| 8 | 1407296 | 6.7% |
| 9 | 1289216 | 6.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 444 | |
| Y | 224 | |
| C | 66 | 7.3% |
| J | 64 | 7.1% |
| H | 52 | 5.7% |
| B | 52 | 5.7% |
| L | 2 | 0.2% |
| N | 2 | 0.2% |
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 2 | |
| b | 2 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3487681 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 4 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 24414787 | |
| Latin | 910 | < 0.1% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 4042523 | |
| . | 3487681 | |
| 5 | 2551383 | |
| 6 | 2509236 | |
| 1 | 2214454 | |
| 4 | 1964647 | |
| 7 | 1881208 | |
| 2 | 1585997 | 6.5% |
| 3 | 1481140 | 6.1% |
| 8 | 1407296 | 5.8% |
| Other values (3) | 1289222 | 5.3% |
Latin
| Value | Count | Frequency (%) |
| S | 444 | |
| Y | 224 | |
| C | 66 | 7.3% |
| J | 64 | 7.0% |
| H | 52 | 5.7% |
| B | 52 | 5.7% |
| L | 2 | 0.2% |
| a | 2 | 0.2% |
| b | 2 | 0.2% |
| N | 2 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 24415697 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 4042523 | |
| . | 3487681 | |
| 5 | 2551383 | |
| 6 | 2509236 | |
| 1 | 2214454 | |
| 4 | 1964647 | |
| 7 | 1881208 | |
| 2 | 1585997 | 6.5% |
| 3 | 1481140 | 6.1% |
| 8 | 1407296 | 5.8% |
| Other values (13) | 1290132 | 5.3% |
start_lat
| Distinct | 345152 |
|---|---|
| Distinct (%) | 9.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.74105508 |
| Minimum | 40.63333225 |
|---|---|
| Maximum | 40.88240421 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 26.6 MiB |
Quantile statistics
| Minimum | 40.63333225 |
|---|---|
| 5-th percentile | 40.67818388 |
| Q1 | 40.71534825 |
| median | 40.739323 |
| Q3 | 40.76350532 |
| 95-th percentile | 40.81613636 |
| Maximum | 40.88240421 |
| Range | 0.249071955 |
| Interquartile range (IQR) | 0.04815707 |
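The last two rows of the quantile table are derived from the others: the range is maximum minus minimum, and the interquartile range is Q3 minus Q1. A short sketch of that arithmetic using the displayed values follows. Because the displayed figures are rounded, the last digits can differ slightly from the report.

```python
# Values copied from the "Quantile statistics" table above.
minimum = 40.63333225
q1 = 40.71534825
q3 = 40.76350532
maximum = 40.88240421

value_range = maximum - minimum   # spread of the full data
iqr = q3 - q1                     # spread of the middle 50% of the data

print(f"range = {value_range:.6f}, IQR = {iqr:.6f}")
```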
Descriptive statistics
| Standard deviation | 0.04010876516 |
|---|---|
| Coefficient of variation (CV) | 0.0009844802761 |
| Kurtosis | 0.3953671373 |
| Mean | 40.74105508 |
| Median Absolute Deviation (MAD) | 0.02409079 |
| Skewness | 0.4357546311 |
| Sum | 142105655.7 |
| Variance | 0.001608713042 |
| Monotonicity | Not monotonic |
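Two of the descriptive statistics can be checked from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A minimal sketch with the figures from the table above (variable names are illustrative):

```python
# Values copied from the "Descriptive statistics" table above.
std = 0.04010876516
mean = 40.74105508

cv = std / mean        # coefficient of variation (unitless)
variance = std ** 2    # variance, in squared degrees

print(f"CV = {cv:.10f}, variance = {variance:.12f}")
```

Since CV divides by the mean, a variable with a negative mean, such as a longitude in the western hemisphere, yields a negative CV.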
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 40.71754834 | 14545 | 0.4% |
| 40.74173969 | 12226 | 0.4% |
| 40.73454567 | 11283 | 0.3% |
| 40.74901271 | 11220 | 0.3% |
| 40.76087502 | 11112 | 0.3% |
| 40.711444 | 11035 | 0.3% |
| 40.76695317 | 10861 | 0.3% |
| 40.74286877 | 10286 | 0.3% |
| 40.74322681 | 9850 | 0.3% |
| 40.7419816 | 9782 | 0.3% |
| Other values (345142) | 3375821 |
| Value | Count | Frequency (%) |
| 40.63333225 | 1 | |
| 40.63333511 | 1 | |
| 40.63334036 | 1 | |
| 40.63334417 | 1 | |
| 40.63334465 | 1 | |
| 40.63334596 | 1 | |
| 40.63334918 | 1 | |
| 40.6333518 | 1 | |
| 40.6333555 | 1 | |
| 40.63335681 | 1 |
| Value | Count | Frequency (%) |
| 40.88240421 | 1 | |
| 40.88232207 | 1 | |
| 40.88231158 | 1 | |
| 40.8823055 | 1 | |
| 40.88229489 | 1 | |
| 40.88229299 | 1 | |
| 40.88228619 | 1 | |
| 40.88228047 | 1 | |
| 40.88227642 | 2 | |
| 40.88227332 | 1 |
start_lng
| Distinct | 320002 |
|---|---|
| Distinct (%) | 9.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -73.97434177 |
| Minimum | -74.02684856 |
|---|---|
| Maximum | -73.88126779 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 3488021 |
| Negative (%) | 100.0% |
| Memory size | 26.6 MiB |
Quantile statistics
| Minimum | -74.02684856 |
|---|---|
| 5-th percentile | -74.00876909 |
| Q1 | -73.99379969 |
| median | -73.9812206 |
| Q3 | -73.95779 |
| 95-th percentile | -73.918214 |
| Maximum | -73.88126779 |
| Range | 0.145580769 |
| Interquartile range (IQR) | 0.03600968643 |
Descriptive statistics
| Standard deviation | 0.02694649222 |
|---|---|
| Coefficient of variation (CV) | -0.0003642680904 |
| Kurtosis | 0.2384908702 |
| Mean | -73.97434177 |
| Median Absolute Deviation (MAD) | 0.01618129014 |
| Skewness | 0.8689945971 |
| Sum | -258024057.5 |
| Variance | 0.0007261134427 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -74.01322069 | 14545 | 0.4% |
| -73.99415556 | 12226 | 0.4% |
| -73.99074142 | 11283 | 0.3% |
| -73.98848395 | 11220 | 0.3% |
| -74.00277668 | 11112 | 0.3% |
| -74.014847 | 11035 | 0.3% |
| -73.98169333 | 10861 | 0.3% |
| -73.98918629 | 10286 | 0.3% |
| -73.97449784 | 9850 | 0.3% |
| -74.0083158 | 9782 | 0.3% |
| Other values (319992) | 3375821 |
| Value | Count | Frequency (%) |
| -74.02684856 | 1 | < 0.1% |
| -74.02682853 | 1 | < 0.1% |
| -74.02682436 | 1 | < 0.1% |
| -74.026823 | 456 | |
| -74.02681851 | 1 | < 0.1% |
| -74.02681649 | 1 | < 0.1% |
| -74.02681255 | 1 | < 0.1% |
| -74.02680516 | 1 | < 0.1% |
| -74.02680457 | 1 | < 0.1% |
| -74.02680278 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| -73.88126779 | 1 | < 0.1% |
| -73.88143372 | 1 | < 0.1% |
| -73.88144517 | 1 | < 0.1% |
| -73.88145 | 281 | |
| -73.88145149 | 1 | < 0.1% |
| -73.88145637 | 1 | < 0.1% |
| -73.88145816 | 1 | < 0.1% |
| -73.88146079 | 1 | < 0.1% |
| -73.88146389 | 1 | < 0.1% |
| -73.88146663 | 1 | < 0.1% |
end_lat
| Distinct | 2328 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.74083298 |
| Minimum | 40.633385 |
|---|---|
| Maximum | 40.88226 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 26.6 MiB |
Quantile statistics
| Minimum | 40.633385 |
|---|---|
| 5-th percentile | 40.6777287 |
| Q1 | 40.7153379 |
| median | 40.73901691 |
| Q3 | 40.76344058 |
| 95-th percentile | 40.815484 |
| Maximum | 40.88226 |
| Range | 0.248875 |
| Interquartile range (IQR) | 0.04810268 |
Descriptive statistics
| Standard deviation | 0.04014157449 |
|---|---|
| Coefficient of variation (CV) | 0.0009852909612 |
| Kurtosis | 0.3940662367 |
| Mean | 40.74083298 |
| Median Absolute Deviation (MAD) | 0.0241370879 |
| Skewness | 0.4390702894 |
| Sum | 142104881 |
| Variance | 0.001611346002 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 40.71754834 | 15594 | 0.4% |
| 40.74173969 | 13475 | 0.4% |
| 40.73454567 | 12705 | 0.4% |
| 40.76087502 | 12281 | 0.4% |
| 40.711444 | 11997 | 0.3% |
| 40.74901271 | 11932 | 0.3% |
| 40.76695317 | 11869 | 0.3% |
| 40.74286877 | 11603 | 0.3% |
| 40.76500525 | 11305 | 0.3% |
| 40.74322681 | 11214 | 0.3% |
| Other values (2318) | 3364046 |
| Value | Count | Frequency (%) |
| 40.633385 | 323 | |
| 40.635679 | 464 | |
| 40.637033 | 378 | |
| 40.63766 | 81 | < 0.1% |
| 40.638196 | 205 | < 0.1% |
| 40.638246 | 244 | |
| 40.639421 | 576 | |
| 40.639673 | 240 | |
| 40.639859 | 166 | < 0.1% |
| 40.639978 | 75 | < 0.1% |
| Value | Count | Frequency (%) |
| 40.88226 | 382 | |
| 40.8802945 | 137 | < 0.1% |
| 40.87935 | 551 | |
| 40.87812 | 218 | < 0.1% |
| 40.877964 | 216 | < 0.1% |
| 40.87704 | 217 | < 0.1% |
| 40.87656 | 255 | |
| 40.875531 | 198 | < 0.1% |
| 40.87444 | 148 | < 0.1% |
| 40.8740704 | 143 | < 0.1% |
end_lng
| Distinct | 2320 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -73.97437321 |
| Minimum | -74.08670068 |
| Maximum | -73.88145 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 3488021 |
| Negative (%) | 100.0% |
| Memory size | 26.6 MiB |
Quantile statistics
| Minimum | -74.08670068 |
|---|---|
| 5-th percentile | -74.00876909 |
| Q1 | -73.993915 |
| median | -73.98122549 |
| Q3 | -73.95779 |
| 95-th percentile | -73.918214 |
| Maximum | -73.88145 |
| Range | 0.2052506779 |
| Interquartile range (IQR) | 0.036125 |
Descriptive statistics
| Standard deviation | 0.0269830382 |
|---|---|
| Coefficient of variation (CV) | -0.0003647619714 |
| Kurtosis | 0.2344138803 |
| Mean | -73.97437321 |
| Median Absolute Deviation (MAD) | 0.01654599057 |
| Skewness | 0.8667193855 |
| Sum | -258024167.2 |
| Variance | 0.0007280843508 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -74.01322069 | 15594 | 0.4% |
| -73.99415556 | 13475 | 0.4% |
| -73.99074142 | 12705 | 0.4% |
| -74.00277668 | 12281 | 0.4% |
| -74.014847 | 11997 | 0.3% |
| -73.98848395 | 11932 | 0.3% |
| -73.98169333 | 11869 | 0.3% |
| -73.98918629 | 11603 | 0.3% |
| -73.95818491 | 11305 | 0.3% |
| -73.97449784 | 11214 | 0.3% |
| Other values (2310) | 3364046 | 96.4% |
| Value | Count | Frequency (%) |
| -74.08670068 | 2 | < 0.1% |
| -74.07195926 | 2 | < 0.1% |
| -74.071455 | 4 | < 0.1% |
| -74.07126188 | 1 | < 0.1% |
| -74.06762213 | 3 | < 0.1% |
| -74.05178863 | 1 | < 0.1% |
| -74.05044364 | 1 | < 0.1% |
| -74.04996783 | 1 | < 0.1% |
| -74.04963791 | 2 | < 0.1% |
| -74.04557168 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| -73.88145 | 392 | < 0.1% |
| -73.88187566 | 305 | < 0.1% |
| -73.881876 | 4 | < 0.1% |
| -73.883365 | 3 | < 0.1% |
| -73.88336509 | 143 | < 0.1% |
| -73.88366 | 255 | < 0.1% |
| -73.88412 | 234 | < 0.1% |
| -73.884308 | 487 | < 0.1% |
| -73.88459 | 193 | < 0.1% |
| -73.88475503 | 216 | < 0.1% |
member_casual
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.3 MiB |
| Value | Count |
|---|---|
| member | 2653232 |
| casual | 834789 |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Characters and Unicode
| Total characters | 20928126 |
|---|---|
| Distinct characters | 9 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | member |
|---|---|
| 2nd row | member |
| 3rd row | member |
| 4th row | member |
| 5th row | member |
Common Values
| Value | Count | Frequency (%) |
| member | 2653232 | 76.1% |
| casual | 834789 | 23.9% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| member | 2653232 | 76.1% |
| casual | 834789 | 23.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| m | 5306464 | 25.4% |
| e | 5306464 | 25.4% |
| b | 2653232 | 12.7% |
| r | 2653232 | 12.7% |
| a | 1669578 | 8.0% |
| c | 834789 | 4.0% |
| s | 834789 | 4.0% |
| u | 834789 | 4.0% |
| l | 834789 | 4.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 20928126 | 100.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| m | 5306464 | 25.4% |
| e | 5306464 | 25.4% |
| b | 2653232 | 12.7% |
| r | 2653232 | 12.7% |
| a | 1669578 | 8.0% |
| c | 834789 | 4.0% |
| s | 834789 | 4.0% |
| u | 834789 | 4.0% |
| l | 834789 | 4.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 20928126 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| m | 5306464 | 25.4% |
| e | 5306464 | 25.4% |
| b | 2653232 | 12.7% |
| r | 2653232 | 12.7% |
| a | 1669578 | 8.0% |
| c | 834789 | 4.0% |
| s | 834789 | 4.0% |
| u | 834789 | 4.0% |
| l | 834789 | 4.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 20928126 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| m | 5306464 | 25.4% |
| e | 5306464 | 25.4% |
| b | 2653232 | 12.7% |
| r | 2653232 | 12.7% |
| a | 1669578 | 8.0% |
| c | 834789 | 4.0% |
| s | 834789 | 4.0% |
| u | 834789 | 4.0% |
| l | 834789 | 4.0% |
start_hour
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14.26480316 |
| Minimum | 0 |
| Maximum | 23 |
| Zeros | 61065 |
| Zeros (%) | 1.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 26.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 11 |
| median | 15 |
| Q3 | 18 |
| 95-th percentile | 22 |
| Maximum | 23 |
| Range | 23 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 5.262146354 |
|---|---|
| Coefficient of variation (CV) | 0.3688902183 |
| Kurtosis | -0.1330801813 |
| Mean | 14.26480316 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | -0.5799520781 |
| Sum | 49755933 |
| Variance | 27.69018425 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=24)
| Value | Count | Frequency (%) |
| 18 | 303758 | 8.7% |
| 17 | 301631 | 8.6% |
| 19 | 253148 | 7.3% |
| 16 | 245703 | 7.0% |
| 15 | 216272 | 6.2% |
| 14 | 207563 | 6.0% |
| 13 | 196836 | 5.6% |
| 20 | 193778 | 5.6% |
| 12 | 192265 | 5.5% |
| 8 | 176008 | 5.0% |
| Other values (14) | 1201059 | 34.4% |
| Value | Count | Frequency (%) |
| 0 | 61065 | 1.8% |
| 1 | 38895 | 1.1% |
| 2 | 25728 | 0.7% |
| 3 | 16145 | 0.5% |
| 4 | 13167 | 0.4% |
| 5 | 24608 | 0.7% |
| 6 | 65267 | 1.9% |
| 7 | 115824 | 3.3% |
| 8 | 176008 | 5.0% |
| 9 | 162926 | 4.7% |
| Value | Count | Frequency (%) |
| 23 | 89909 | 2.6% |
| 22 | 121217 | 3.5% |
| 21 | 142570 | 4.1% |
| 20 | 193778 | 5.6% |
| 19 | 253148 | 7.3% |
| 18 | 303758 | 8.7% |
| 17 | 301631 | 8.6% |
| 16 | 245703 | 7.0% |
| 15 | 216272 | 6.2% |
| 14 | 207563 | 6.0% |
end_hour
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14.3728309 |
| Minimum | 0 |
| Maximum | 23 |
| Zeros | 67813 |
| Zeros (%) | 1.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 26.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 11 |
| median | 15 |
| Q3 | 18 |
| 95-th percentile | 22 |
| Maximum | 23 |
| Range | 23 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 5.350884359 |
|---|---|
| Coefficient of variation (CV) | 0.3722916102 |
| Kurtosis | -0.08061610459 |
| Mean | 14.3728309 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | -0.6286312898 |
| Sum | 50132736 |
| Variance | 28.63196342 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=24)
| Value | Count | Frequency (%) |
| 18 | 308895 | 8.9% |
| 17 | 290508 | 8.3% |
| 19 | 264699 | 7.6% |
| 16 | 237117 | 6.8% |
| 15 | 213711 | 6.1% |
| 20 | 208385 | 6.0% |
| 14 | 203291 | 5.8% |
| 13 | 194851 | 5.6% |
| 12 | 187595 | 5.4% |
| 9 | 166878 | 4.8% |
| Other values (14) | 1212091 | 34.8% |
| Value | Count | Frequency (%) |
| 0 | 67813 | 1.9% |
| 1 | 43351 | 1.2% |
| 2 | 28871 | 0.8% |
| 3 | 18338 | 0.5% |
| 4 | 13631 | 0.4% |
| 5 | 21384 | 0.6% |
| 6 | 57828 | 1.7% |
| 7 | 103790 | 3.0% |
| 8 | 165186 | 4.7% |
| 9 | 166878 | 4.8% |
| Value | Count | Frequency (%) |
| 23 | 98991 | 2.8% |
| 22 | 127860 | 3.7% |
| 21 | 153649 | 4.4% |
| 20 | 208385 | 6.0% |
| 19 | 264699 | 7.6% |
| 18 | 308895 | 8.9% |
| 17 | 290508 | 8.3% |
| 16 | 237117 | 6.8% |
| 15 | 213711 | 6.1% |
| 14 | 203291 | 5.8% |
elapsed_min
| Distinct | 1510 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.70321566 |
| Minimum | -16 |
| Maximum | 43425 |
| Zeros | 98098 |
| Zeros (%) | 2.8% |
| Negative | 84 |
| Negative (%) | < 0.1% |
| Memory size | 26.6 MiB |
Quantile statistics
| Minimum | -16 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 5 |
| median | 10 |
| Q3 | 19 |
| 95-th percentile | 41 |
| Maximum | 43425 |
| Range | 43441 |
| Interquartile range (IQR) | 14 |
Descriptive statistics
| Standard deviation | 67.22909684 |
|---|---|
| Coefficient of variation (CV) | 4.2812312 |
| Kurtosis | 145941.1132 |
| Mean | 15.70321566 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | 300.4039448 |
| Sum | 54773146 |
| Variance | 4519.751462 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 5 | 204859 | 5.9% |
| 6 | 201035 | 5.8% |
| 4 | 198237 | 5.7% |
| 7 | 191771 | 5.5% |
| 8 | 179261 | 5.1% |
| 3 | 177728 | 5.1% |
| 9 | 164785 | 4.7% |
| 10 | 150854 | 4.3% |
| 2 | 141468 | 4.1% |
| 11 | 137509 | 3.9% |
| Other values (1500) | 1740514 | 49.9% |
| Value | Count | Frequency (%) |
| -16 | 1 | < 0.1% |
| -13 | 1 | < 0.1% |
| -9 | 1 | < 0.1% |
| -6 | 1 | < 0.1% |
| -3 | 2 | < 0.1% |
| -1 | 78 | < 0.1% |
| 0 | 98098 | 2.8% |
| 1 | 96574 | 2.8% |
| 2 | 141468 | 4.1% |
| 3 | 177728 | 5.1% |
| Value | Count | Frequency (%) |
| 43425 | 1 | < 0.1% |
| 41396 | 1 | < 0.1% |
| 36935 | 1 | < 0.1% |
| 24615 | 1 | < 0.1% |
| 23866 | 1 | < 0.1% |
| 23195 | 1 | < 0.1% |
| 22511 | 1 | < 0.1% |
| 18717 | 1 | < 0.1% |
| 17594 | 1 | < 0.1% |
| 17425 | 1 | < 0.1% |
Auto
The auto setting is an easily interpretable pairwise column metric that picks a method by variable-type pair: categorical-categorical uses Cramér's V, numerical-categorical uses Cramér's V (on a discretized numerical column), and numerical-numerical uses Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and 1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
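The calculation described above can be sketched in plain Python. This is an illustrative sketch, not the code the report generator runs: the helper names `average_ranks` and `spearman_rho` are hypothetical, ties are assumed to receive the average of their rank positions, and the toy data is made up.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their std devs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    # The 1/n factors of covariance and standard deviations cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Hypothetical toy data: y grows monotonically, but not linearly, with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
```

Because only the ordering of the values matters, `rho` equals 1 (up to floating-point rounding) for this cubic relationship, even though the relationship is not linear.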
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and 1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
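As a rough illustration of that formula and of the invariance property, the sketch below computes r in plain Python. The function name `pearson_r` is hypothetical, and the two lists are the `start_hour` and `end_hour` values of the ten rows listed under First rows in this report, so the result describes only that small sample, not the full dataset.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of covariance and standard deviations cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# start_hour and end_hour of the ten "First rows" of the report.
start_hour = [11, 18, 16, 17, 7, 19, 20, 6, 11, 15]
end_hour = [11, 18, 17, 18, 7, 19, 20, 6, 11, 15]

r = pearson_r(start_hour, end_hour)

# Shifting and rescaling one variable (positive scale) leaves r unchanged.
r_rescaled = pearson_r(start_hour, [3 * h + 7 for h in end_hour])
```

On this sample `r` is close to 1, and `r_rescaled` matches it up to floating-point rounding, which is the location and scale invariance mentioned above.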
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and 1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
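The pair-counting definition above corresponds to the variant usually called tau-a. The following is a minimal sketch of it, assuming tied pairs simply count as neither concordant nor discordant; the function name `kendall_tau_a` is hypothetical, and statistical libraries such as SciPy default to the tie-adjusted tau-b, so results can differ when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y; it adds to neither count.
    return (concordant - discordant) / len(pairs)


# Hypothetical toy data: one of the three pairs is out of order.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

Here two pairs are concordant and one is discordant, so `tau` is (2 - 1) / 3.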
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
First rows
| | ride_id | rideable_type | started_at | ended_at | start_station_name | start_station_id | end_station_name | end_station_id | start_lat | start_lng | end_lat | end_lng | member_casual | start_hour | end_hour | elapsed_min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | C09E4093905089BD | classic_bike | 2022-07-23 11:34:57 | 2022-07-23 11:45:08 | Melrose St & Broadway | 4801.04 | Myrtle Ave & Grove St | 4816.05 | 40.697481 | -73.935877 | 40.699050 | -73.915160 | member | 11 | 11 | 10.0 |
| 1 | 374630DB5822C392 | electric_bike | 2022-07-29 18:19:08 | 2022-07-29 18:26:50 | E 68 St & 3 Ave | 6896.16 | E 85 St & York Ave | 7146.04 | 40.767128 | -73.962246 | 40.775369 | -73.948034 | member | 18 | 18 | 7.0 |
| 2 | 4F73CA25880A1215 | electric_bike | 2022-07-16 16:30:58 | 2022-07-16 17:39:18 | W 37 St & 10 Ave | 6611.02 | Knickerbocker Ave & Cooper St | 4582.05 | 40.756604 | -73.997901 | 40.690810 | -73.904480 | member | 16 | 17 | 68.0 |
| 3 | ECD6EE19C0CC1D31 | electric_bike | 2022-07-17 17:35:57 | 2022-07-17 18:03:36 | W 37 St & 10 Ave | 6611.02 | 6 Ave & Broome St | 5610.09 | 40.756604 | -73.997901 | 40.724310 | -74.004730 | member | 17 | 18 | 27.0 |
| 4 | 44D0987673B9997D | classic_bike | 2022-07-11 07:56:29 | 2022-07-11 07:59:15 | E 68 St & 3 Ave | 6896.16 | E 66 St & Madison Ave | 6969.08 | 40.767128 | -73.962246 | 40.768009 | -73.968453 | member | 7 | 7 | 2.0 |
| 5 | A80F03B56110AFF8 | classic_bike | 2022-07-14 19:35:53 | 2022-07-14 19:50:06 | Clinton Ave & Flushing Ave | 4762.04 | Bergen St & 4 Ave | 4322.06 | 40.697940 | -73.969868 | 40.682564 | -73.979898 | member | 19 | 19 | 14.0 |
| 6 | D967C4FDF71ADE61 | classic_bike | 2022-07-26 20:18:17 | 2022-07-26 20:26:57 | E 68 St & 3 Ave | 6896.16 | E 85 St & York Ave | 7146.04 | 40.767128 | -73.962246 | 40.775369 | -73.948034 | member | 20 | 20 | 8.0 |
| 7 | 62DA916392DE2A17 | electric_bike | 2022-07-13 06:46:50 | 2022-07-13 06:50:26 | E 89 St & York Ave | 7204.08 | E 85 St & York Ave | 7146.04 | 40.777945 | -73.946041 | 40.775369 | -73.948034 | member | 6 | 6 | 3.0 |
| 8 | DBFDF326FBAC1C0B | classic_bike | 2022-07-02 11:54:21 | 2022-07-02 11:57:11 | E 89 St & York Ave | 7204.08 | E 85 St & York Ave | 7146.04 | 40.777945 | -73.946041 | 40.775369 | -73.948034 | member | 11 | 11 | 2.0 |
| 9 | 5BB3497D14360353 | electric_bike | 2022-07-31 15:30:06 | 2022-07-31 15:34:57 | 35 Ave & 37 St | 6563.12 | 38 St & 30 Ave | 6850.01 | 40.755733 | -73.923661 | 40.764175 | -73.915840 | member | 15 | 15 | 4.0 |
Last rows
| | ride_id | rideable_type | started_at | ended_at | start_station_name | start_station_id | end_station_name | end_station_id | start_lat | start_lng | end_lat | end_lng | member_casual | start_hour | end_hour | elapsed_min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3488011 | 94DF6549B0B8B543 | electric_bike | 2022-07-30 06:38:38 | 2022-07-30 06:52:47 | W 106 St & Central Park West | 7606.01 | Grand Army Plaza & Central Park S | 6839.10 | 40.798186 | -73.960591 | 40.764397 | -73.973715 | member | 6 | 6 | 14.0 |
| 3488012 | E593817D3BF47494 | classic_bike | 2022-07-20 09:56:28 | 2022-07-20 10:11:00 | W 59 St & 10 Ave | 7023.04 | W 51 St & Rockefeller Plaza | 6700.14 | 40.770513 | -73.988038 | 40.759738 | -73.978116 | member | 9 | 10 | 14.0 |
| 3488013 | DC2D0A8E007D50A1 | classic_bike | 2022-07-07 18:28:26 | 2022-07-07 18:34:17 | W 59 St & 10 Ave | 7023.04 | Grand Army Plaza & Central Park S | 6839.10 | 40.770513 | -73.988038 | 40.764397 | -73.973715 | member | 18 | 18 | 5.0 |
| 3488014 | D345940C40C711AE | classic_bike | 2022-07-20 23:21:09 | 2022-07-20 23:28:22 | Underhill Ave & Pacific St | 4231.04 | Adelphi St & Myrtle Ave | 4620.02 | 40.680484 | -73.964680 | 40.693083 | -73.971789 | member | 23 | 23 | 7.0 |
| 3488015 | CDB429E557F530A5 | electric_bike | 2022-07-07 12:47:16 | 2022-07-07 12:55:03 | W 59 St & 10 Ave | 7023.04 | W 51 St & Rockefeller Plaza | 6700.14 | 40.770513 | -73.988038 | 40.759738 | -73.978116 | member | 12 | 12 | 7.0 |
| 3488016 | EF571F06A5E34311 | docked_bike | 2022-07-10 14:08:11 | 2022-07-10 14:36:45 | W 106 St & Central Park West | 7606.01 | Grand Army Plaza & Central Park S | 6839.10 | 40.798186 | -73.960591 | 40.764397 | -73.973715 | casual | 14 | 14 | 28.0 |
| 3488017 | 9F9F55113999F07B | electric_bike | 2022-07-27 01:47:52 | 2022-07-27 02:11:40 | Delancey St & Eldridge St | 5414.07 | Calyer St & Guernsey St | 5709.03 | 40.719383 | -73.991479 | 40.727558 | -73.955059 | member | 1 | 2 | 23.0 |
| 3488018 | CBB7EF472D45EDAD | classic_bike | 2022-07-18 19:35:34 | 2022-07-18 19:57:15 | Grand Concourse & East Mount Eden Ave | 8265.09 | Weeks Ave & E 175 St | 8340.05 | 40.843043 | -73.911753 | 40.846879 | -73.907342 | member | 19 | 19 | 21.0 |
| 3488019 | C2176A140EBE4FB5 | classic_bike | 2022-07-25 08:13:04 | 2022-07-25 08:58:29 | Court St & State St | 4488.08 | E 53 St & Lexington Ave | 6617.09 | 40.690147 | -73.992072 | 40.758281 | -73.970694 | member | 8 | 8 | 45.0 |
| 3488020 | 405C56C8930284FD | classic_bike | 2022-07-08 12:18:47 | 2022-07-08 12:27:02 | W 59 St & 10 Ave | 7023.04 | Grand Army Plaza & Central Park S | 6839.10 | 40.770513 | -73.988038 | 40.764397 | -73.973715 | member | 12 | 12 | 8.0 |